OP_SUBSTR_LEFT - a specialised OP_SUBSTR variant #22785
base: blead
d6f958e to f258d43
A couple of small comments but overall nothing troubling-looking here.
I wonder a bit about the name though. I've usually seen the word "nibble" to mean a half-byte; i.e. a 4-bit value. I wondered if that is what is going on here at first. If there are other candidate names to call it, perhaps something else would be better? Not a huge problem though.
How about food-related alternatives:
Not sure what's going on with the ABRT test failures. Don't get them locally.
Looks like an op_private flags assertion. I'll dig into it soon.
f258d43 to 98d187d
I'm rebasing and renaming it to
Doesn't perl's
The Perl
@richardleach, merge conflicts ^^
98d187d to 24108a7
Oh wow. Huh. In that case, might as well call this one Otherwise my thoughts were going to be something like
Consider ltrim, with inspiration from PHP and Redis (or lstrip a la Ruby/Python but that sounds more whitespace-specific). Though it is also unrelated to builtin::trim, I think it's a bit more descriptive at least
Hmmm, I'm not sure about this. It seems only more descriptive to someone who already is familiar with
Maybe
Ok, that seems straightforward enough without colliding with Perlspace. Will rename.
24108a7 to 2712d8f
Variants are named to match the style of macros in op.h
2712d8f to 3c55fa6
3c55fa6 to c468cc5
OP renamed to
This commit adds OP_SUBSTR_LEFT and associated machinery for fast handling of the constructions:

    substr EXPR,0,LENGTH,''

and

    substr EXPR,0,LENGTH

Where EXPR is a scalar lexical, the OFFSET is zero, and either there is no REPLACEMENT or it is the empty string. LENGTH can be anything that OP_SUBSTR supports. These constraints allow for a very stripped-back and optimised version of pp_substr.

The primary motivation was for situations where a scalar, containing some network packets or other binary data structure, is being parsed piecemeal. Nibbling away at the scalar can be useful when you don't know how exactly it will be parsed and unpacked until you get started. It also means that you don't need to worry about correctly updating a separate offset variable.

This operator also turns out to be an efficient way to (destructively) break an expression up into fixed-size chunks. For example, given:

    my $x = '';
    my $str = "A"x100_000_000;

This code:

    $x = substr($str, 0, 5, "") while ($str);

is twice as fast as doing:

    for ($pos = 0; $pos < length($str); $pos += 5) {
        $x = substr($str, $pos, 5);
    }

Compared with blead, `$y = substr($x, 0, 5)` runs 40% faster and `$y = substr($x, 0, 5, '')` runs 45% faster.
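As a rough illustration of the piecemeal-parsing use case described in this commit message, the sketch below nibbles records off the front of a lexical scalar with four-argument `substr`. The record layout (a 1-byte type, a 2-byte big-endian length, then the payload) and the `parse_records` helper are assumptions made up for this example, not anything taken from the PR, and the code assumes well-formed input with no truncated records.

```perl
use strict;
use warnings;

# Hypothetical record layout (an assumption for illustration only):
#   1-byte type, 2-byte big-endian payload length, then the payload.
sub parse_records {
    my ($buf) = @_;    # private copy, because nibbling is destructive
    my @records;
    while (length $buf) {
        # Each call removes the bytes it returns from the front of $buf,
        # so no separate offset variable has to be maintained.
        my $type = unpack 'C', substr($buf, 0, 1, '');
        my $len  = unpack 'n', substr($buf, 0, 2, '');
        my $data = substr($buf, 0, $len, '');
        push @records, { type => $type, data => $data };
    }
    return \@records;
}

my $stream  = pack('C n a*', 1, 5, 'hello') . pack('C n a*', 2, 3, 'abc');
my $records = parse_records($stream);
printf "type=%d data=%s\n", $_->{type}, $_->{data} for @$records;
```

Every `substr` call here has a lexical scalar as EXPR, a constant zero OFFSET and an empty-string REPLACEMENT, which is the shape the commit message says the new op handles; whether a given perl actually optimises each call is not something this sketch verifies.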
As suggested in Perl#22785
c468cc5 to 0e93328
    if (SvROK(sv) && do_chop) {
        Perl_ck_warner(aTHX_ packWARN(WARN_SUBSTR),
                       "Attempt to use reference as lvalue in substr"
        );
    }

    if (do_chop) {
        SvGETMAGIC(sv);
        tmps = SvPV_force_nomg(sv, curlen);
    } else
        tmps = SvPV_const(sv, curlen);
You're checking SvROK(sv) before calling SvGETMAGIC() which would be incorrect if sv is tied - I think I would have caught that before, but maybe I missed it.
You could move the reference check after the SvGETMAGIC() inside the do_chop conditional to fix it (and better match pp_substr).
On Thu, Dec 19, 2024 at 02:35:50AM -0800, bulk88 wrote:
> BINOPs like PP
> ```
> if(index($str, 'ZZZZZZ') == -1) {
> }
> ```
> XS has no concept of "G_BOOL" context. There is definitely a need to deliver bool context from the runloop to XS.

What has this got to do with the proposed OP_SUBSTR_LEFT op?

--
"You may not work around any technical limitations in the software"
-- Windows Vista license
This commit adds `OP_SUBSTR_NIBBLE` and associated machinery for fast handling of the constructions:

    substr EXPR,0,LENGTH,''

and

    substr EXPR,0,LENGTH

Where `EXPR` is a scalar lexical, the `OFFSET` is zero, and either there is no `REPLACEMENT` or it is the empty string. `LENGTH` can be anything that `OP_SUBSTR` supports. These constraints allow for a very stripped-back and optimised version of pp_substr.

The primary motivation was for situations where a scalar, containing some network packets or other binary data structure, is being parsed piecemeal. Nibbling away at the scalar can be useful when you don't know how exactly it will be parsed and unpacked until you get started. It also means that you don't need to worry about correctly updating a separate offset variable.

This operator also turns out to be an efficient way to (destructively) break an expression up into fixed-size chunks. For example, given:

    my $x = '';
    my $str = "A"x100_000_000;

This code:

    $x = substr($str, 0, 5, "") while ($str);

is twice as fast as doing:

    for ($pos = 0; $pos < length($str); $pos += 5) {
        $x = substr($str, $pos, 5);
    }

Compared with blead, `$y = substr($x, 0, 5)` runs 40% faster and `$y = substr($x, 0, 5, '')` runs 45% faster.
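To make the chunking comparison concrete, here is a minimal sketch showing that the destructive form and the offset-based loop produce the same chunks. The helper names `chunks_destructive` and `chunks_by_offset` are hypothetical, the chunk size is assumed to be a positive integer, and the sketch checks equivalence only; it does not attempt to reproduce the timings quoted above.

```perl
use strict;
use warnings;

# Destructive form: consume the string from the front, $size characters
# at a time. $str is a private copy, so the caller's string is untouched.
sub chunks_destructive {
    my ($str, $size) = @_;
    my @chunks;
    push @chunks, substr($str, 0, $size, '') while length $str;
    return @chunks;
}

# Non-destructive form: walk an explicit offset through the string.
sub chunks_by_offset {
    my ($str, $size) = @_;
    my @chunks;
    for (my $pos = 0; $pos < length($str); $pos += $size) {
        push @chunks, substr($str, $pos, $size);
    }
    return @chunks;
}

my @a = chunks_destructive('ABCDEFGHIJKL', 5);
my @b = chunks_by_offset('ABCDEFGHIJKL', 5);
print "@a\n";
print "@b\n";
```

The loop condition uses `length $str` rather than the bare truth test `while ($str)` from the description, since a string whose remaining content is exactly "0" is false in Perl and would stop the loop one chunk early.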